Visualizing Temporal Periodicity in Categorical Datasets
Abstract
When analysing time series data, identifying periodic patterns is a common task. Several methods have been proposed to locate such patterns, many of them based on visualization. One such visualization is the Spiral Display, in which time series data is mapped onto a spiral representation of a timespan. When an appropriate period is chosen for the spiral, periodic patterns reveal themselves radially. Most implementations of the Spiral Display to date have focused on quantitative, continuous and univariate data, using encoding schemes consistent with that premise. Here, an alternative implementation of the Spiral Display is presented, which deals primarily with categorical, discrete and multivariate data. It uses a glyph-based representation of datapoints, with colour hue and shape encoding attribute data. To facilitate exploratory analysis, several interaction features were added, including controls for detail, overview, suppression, encoding control, spiral control and animation of the spiral period. Several methods for visualizing time series data are also evaluated, along with techniques for interaction and data encoding which had a bearing on the development of the implementation.
User tests have shown that the implementation is an effective means of finding periodic patterns in categorical, discrete and multivariate time series data. The tests revealed several factors which affect the difficulty of finding such patterns and highlighted the importance of interaction features such as animation, encoding and suppression. Testing also revealed that, while powerful, the Spiral Display requires some learning to use properly, and that users tend to rely on more conventional means of determining pattern periods when that is feasible.
Swedish title (Referat): Visualisering av tidsmässig periodicitet i kategoriska datamängder
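The radial alignment described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation; the function names (`spiral_point`, `distinct_angles`), the Archimedean spiral layout, and the parameters `inner_radius` and `ring_gap` are all assumptions made for illustration. Each time value is mapped to an angle (its phase within the chosen period) and a radius (how many periods have elapsed), so events that recur with the chosen period share an angle and line up along a ray.

```python
import math

def spiral_point(t, period, inner_radius=1.0, ring_gap=0.5):
    """Map a time value to (angle, radius) on a spiral with one turn per period.

    Hypothetical layout: inner_radius and ring_gap are illustrative parameters,
    not values taken from the source.
    """
    turns = t / period                         # elapsed periods, fractional
    angle = 2.0 * math.pi * (turns % 1.0)      # phase within the current period
    radius = inner_radius + ring_gap * turns   # grows steadily with time
    return angle, radius

def distinct_angles(times, period, digits=6):
    """Return the set of (rounded) angles occupied by the given time values."""
    return {round(spiral_point(t, period)[0], digits) for t in times}

# Hypothetical events recurring every 7 time units.
events = [3, 10, 17, 24]

aligned = distinct_angles(events, period=7)    # matching period: one shared angle
scattered = distinct_angles(events, period=5)  # mismatched period: angles differ

print(len(aligned), len(scattered))
```

With the matching period all four events fall on a single ray, while the mismatched period spreads them over four different angles, which is why animating the period can make a pattern snap into place.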
Similar resources
Visualizing Relationships among Categorical Variables
Centuries of chart-making have produced some outstanding charts tailored specifically to the data being visualized. They have also produced a myriad of less-than-outstanding charts in the same vein. I instead present a set of techniques that may be applied to arbitrary datasets with specific properties. In particular, I describe two techniques – Nested Category Maps and Correlation Maps – for v...
Mining Fuzzy Temporal Itemsets within Various Time Intervals in Quantitative Datasets
This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high support or occurrence within particular time intervals, they do not necessarily get similar support across the whole dataset, which makes i...
On Summarizing Large-Scale Dynamic Graphs
How can we describe a large, dynamic graph over time? Is it random? If not, what are the most apparent deviations from randomness – a dense block of actors that persists over time, or perhaps a star with many satellite nodes that appears with some fixed periodicity? In practice, these deviations indicate patterns – for example, research collaborations forming and fading away over the years. Whi...
Visualizing and Modeling Categorical Time Series Data
Categorical time series data cannot be effectively visualized and modeled using methods developed for ordinal data. The arbitrary mapping of categorical data to ordinal values can have a number of undesirable consequences. New techniques for visualizing and modeling categorical time series data are described, and examples are presented using computer and communications network traces.
Spatio-temporal variability of aerosol characteristics in Iran using remotely sensed datasets
The present study is the first attempt to examine temporal and spatial characteristics of aerosol properties and classify their modes over Iran. The data used in this study include the records of Aerosol Optical Depth (AOD) and Angstrom Exponent (AE) from MODerate Resolution Imaging Spectroradiometer (MODIS) and Aerosol Index (AI) from the Ozone Monitoring Instrument (OMI), obtained from 2005 t...
Estimating the Location of Protein-Coding Regions in Numerical DNA Sequences Using a Variable-Length Window Based on the Three-Dimensional Z-Curve
In recent years, estimation of protein-coding regions in numerical deoxyribonucleic acid (DNA) sequences using signal processing tools has been a challenging issue in bioinformatics, owing to their 3-base periodicity. Several digital signal processing (DSP) tools have been applied in order to identify the task and concentrated on assigning numerical values to the symbolic DNA sequence, then app...